Metadata harvesting in regional digital libraries in PIONIER Network
Abstract
The national programme “PIONIER: Polish Optical Internet. Advanced Applications, Services and Technologies for Information Society” has been under way in Poland since 2001. One of its main aims was to enrich the content-based services in the Polish NREN, and to reach this goal several digital library installations have been started. This activity was meant to assist librarians and university publishers in digital content management and publishing. However, to reflect their expectations (e.g. authorized, local access to academic scripts, identification of the owner of manuscripts, preservation of regional cultural heritage), PIONIER introduced the concept of regional digital libraries, starting in 2002 with the Digital Library of the Wielkopolska Region [1]. PIONIER regional digital libraries currently comprise the installations based on dLibra software [2]. dLibra is portable, distributed digital library software designed to support the entire publication lifecycle. It was developed in cooperation with librarians from various university libraries and offers users many features, from basic browsing of library content to RSS-based [3] notifications and directory-based metadata search. By the end of 2005, four regional digital libraries and several institutional installations had been deployed in PIONIER.
The dLibra Digital Library Framework is a platform developed as an easily adjustable software basis for digital libraries. The dLibra project was started in 1999 at the Poznan Supercomputing and Networking Center, and it has since become the most popular Polish digital library framework. The dLibra platform consists of six specialized, portable servers that together form a complex digital library system [4]. Using dLibra client applications, a user can store digital objects of any type, such as text (PDF, DjVu, HTML, etc.), images, audio or video. All stored objects can be described precisely with an adjustable set of metadata attributes.
Sophisticated mechanisms support the creation of metadata, such as dictionaries of attribute values with thesaurus functionality. dLibra supports many well-known standards, such as MARC [5], RDF [6] and Dublin Core [7], which are used for metadata exchange. A user who wants to access content gathered in a dLibra-based digital library can use the WWW pages generated by the dLibra framework. These pages make it easy to browse and search all digital collections of a given digital library and, through the OAI-PMH interface [8], those of other installations. Access to all gathered assets is precisely controlled by one of the dLibra servers. In this paper we address communication between digital libraries, in the sense of exploring metadata and information about library structure. The latest functionality provided for all PIONIER digital libraries is an implementation of the OAI-PMH protocol, which transformed the set of regional digital libraries into a distributed platform in which each digital library became an access point to all regional resources stored in PIONIER digital libraries (the complete list of digital libraries in PIONIER is available at http://dlibra.psnc.pl). The service that handles communication with other repositories is the dLibra Distributed Search Server. It is used to harvest remote dLibra instances by means of the OAI-PMH protocol. It also lets the user search through the gathered remote metadata. In fact, any OAI-PMH-enabled repository can be harvested and searched using that service. The deployment of the OAI-PMH protocol enables communication with other digital library systems, not only those based on dLibra software.
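As an illustration of the kind of harvesting described above, the following is a minimal sketch of one OAI-PMH `ListRecords` step in Python. It is not taken from dLibra or its Distributed Search Server. The function names, the `example.org` endpoint and the sample response are hypothetical; only the protocol elements (`verb`, `metadataPrefix`, `resumptionToken`, the OAI-PMH 2.0 and Dublin Core namespaces) come from the OAI-PMH specification. The sketch builds a request URL and extracts record identifiers, Dublin Core titles and the resumption token from a response, without making any network call.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH 2.0, resumptionToken is an exclusive argument:
        # it must not be combined with metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        identifier = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in rec.iter(f"{{{DC_NS}}}title")]
        records.append({"identifier": identifier, "titles": titles})
    token_el = root.find(f".//{{{OAI_NS}}}resumptionToken")
    # An absent or empty token means the list is complete.
    has_token = token_el is not None and token_el.text
    token = token_el.text.strip() if has_token else None
    return records, token


# Made-up response used only to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample publication</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_list_records_url("http://example.org/oai"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A full harvester would fetch each URL, call the parser, and repeat with the returned token until none is returned. Error handling, deleted records and incremental harvesting with `from`/`until` are left out of this sketch.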
In the paper we refer to other examples of virtual collections [9][10] and mention similar solutions implemented in other countries; however, none of them offers this level of unified access to distributed resources managed by the same kind of digital library framework. We conclude the paper with examples of content-based services that are enabled by the PIONIER platform for distributed regional digital libraries and provided to research and education users. These include virtual collections of regional cultural heritage, distributed exhibitions, and scientific comments and annotations for groups of digital resources. Another group of complementary services covers information services provided by Grid environments [11]. PIONIER currently provides access to more than 10,000 digital publications in its regional digital libraries. This is already a huge potential for research and education activity, and it also stimulates the development of regional RENs through building new regional services.
Related resources
Distributed Digital Libraries Platform in the PIONIER Network
The dLibra Digital Library Framework (http://dlibra.psnc.pl/) is a Polish digital library software platform developed by Poznan Supercomputing and Networking Center as a part of the PIONIER programme (http://www.pionier.gov.pl/). The dLibra project was started in 1999, as a part of research in the field of digital libraries started in PSNC in 1996. The developed platform is currently the most p...
An RDF Model for Multi-level Hypertext in Digital Libraries
A core concept of the Semantic Web is to enrich Web documents with machine-readable metadata (i.e. data describing the resources). Such metadata and the corresponding documents are already provided by a growing number of digital libraries on the Web. The Open Archives Initiative Protocol for Metadata Harvesting ([VdSL01]) provides a standard for harvesting metadata from any digital library t...
Building Digital Libraries from Simple Building Blocks
Metadata harvesting has been established by the Open Archives Initiative (OAI) as a viable mechanism for connecting a provider of data to a purveyor of services. The Open Digital Library (ODL) model is an emerging framework which attempts to break up the services into appropriate components based also on the basic philosophy of the OAI model. This framework has been applied to various projects ...
mod_oai: An Apache Module for Metadata Harvesting
We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is the de facto standard for metadata exchange in digital libraries and allows repositories to expose their contents in a structured, application-neutral format with semantics optimized for accurate incremental harvesting. mod_oai differs from other OAIPMH i...
Metadata Harvesting with R and OAI-PMH
The Open Archives Initiative (http://www.openarchives.org/) develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. One key project is the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH, http://www.openarchives.org/pmh/) which provides “a low-barrier mechanism for repository interoperability” for archives (institutiona...